A holistic view of stream partitioning costs

Authors

  • Nikos R. Katsipoulakis
  • Alexandros Labrinidis
  • Panos K. Chrysanthis
Abstract

Stream processing has become the dominant processing model for monitoring and real-time analytics. Modern Parallel Stream Processing Engines (pSPEs) have made it feasible to increase the performance of both monitoring and analytical queries by parallelizing a query's execution and distributing the load across multiple workers. A determining factor for the performance of a pSPE is the partitioning algorithm used to disseminate tuples to workers. Until now, partitioning methods in pSPEs have been similar to the ones used in parallel databases, and only recently have load-aware algorithms been employed to improve the effectiveness of parallel execution. We identify and demonstrate the need to incorporate aggregation costs in the partitioning model when executing stateful operations in parallel, in order to minimize overall latency and/or maximize throughput. To this end, we propose new stream partitioning algorithms that consider both tuple imbalance and aggregation cost. We evaluate our proposed algorithms and show that they can achieve up to an order of magnitude better performance compared to the current state of the art.
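
The trade-off the abstract describes can be illustrated with a small sketch. This is not the authors' algorithm: the function names are hypothetical, "imbalance" is assumed here to mean the maximum worker load minus the mean load, and "aggregation cost" is approximated by the number of partial per-key results that a downstream step would have to merge.

```python
from collections import Counter
from zlib import crc32


def hash_partition(keys, workers):
    """Key-based partitioning: every tuple with the same key goes to one worker."""
    return [crc32(k.encode()) % workers for k in keys]


def shuffle_partition(keys, workers):
    """Round-robin partitioning: ignore the key and spread tuples evenly."""
    return [i % workers for i, _ in enumerate(keys)]


def imbalance(assignment, workers):
    """Assumed metric: maximum worker load minus mean worker load."""
    load = Counter(assignment)
    loads = [load.get(w, 0) for w in range(workers)]
    return max(loads) - sum(loads) / workers


def aggregation_cost(keys, assignment):
    """Assumed proxy: number of distinct (key, worker) partial results to merge."""
    return len(set(zip(keys, assignment)))
```

On a skewed input such as six tuples with key "a" plus one each for "b" and "c", split over two workers, round-robin gives zero imbalance but splits key "a" into two partial results, while key-based partitioning produces exactly one partial result per key at the price of an overloaded worker. A partitioner that weighs both quantities sits between these two extremes.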


Similar resources

(Re)partitioning for stream-enabled computation

Partitioning an input graph over a set of workers is a complex operation. The objectives are twofold: split the work evenly, so that every worker gets an equal share, and minimize the edge cut to achieve good work locality (i.e. workers can work independently). Partitioning a graph accessible from memory is a notorious NP-complete problem. Motivated by the renewed interest in stream processing...


Holistic View in Medicine

Modern medicine was born in the modernist period around two centuries ago, with a materialistic view in general and a purely biological approach to health and disease. This new approach was based on the widespread philosophical view of that era, especially the hypotheses of Claude Bernard. As our understanding of human biology has tremendously progressed, and the emergence of the postmodern era has o...


Holistic distributed stream clustering for smart grids

Smart grids consist of millions of automated electronic meters that will be installed in electricity distribution networks and connected to servers that will manage grid supervision, billing and customer services. World sustainability regarding energy management will definitely rely on such grids, so smart grids need also to be sustainable themselves. This sustainability depends on several rese...


The Sector–stream Matrix: Introducing a New Framework for the Analysis of Environmental Performance

Environmental strategy is currently in transition from a reductionist view of individual technologies in isolation to a holistic and interdisciplinary view of the relationship between society, technology, and environmental impact. As a contribution to this larger effort, this paper uses a systems analytic approach to develop a ‘sector-stream matrix’ of functions and objectives that have an impa...


Cockpit Crew Pairing Problem in Airline Scheduling: Shortest Path with Resources Constraints Approach

Increasing competition in the air transport market has intensified active airlines’ efforts to keep their market share by attaching due importance to cost management aimed at reduced final prices. Crew costs are second only to fuel costs on the cost list of airline companies. So, this paper attempts to investigate the cockpit crew pairing problem. The set partitioning problem has been used for ...



Journal title:
  • PVLDB

Volume 10  Issue 

Pages  -

Publication date 2017